{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Input data\n",
    "\n",
    "Working with sequential or time series data requires a consistent and regular spacing between observations.\n",
    "Uneven or irregularly spaced data can lead to ambiguous results and unreliable forecasts. For this reason, **skforecast** strictly enforces the use of **regular indices**.\n",
    "\n",
    "To ensure reproducibility and clarity in forecasting tasks, skforecast only allows two types of index:\n",
    "\n",
    "+ **DatetimeIndex with frequency**: A time-based index with a defined and regular frequency (e.g., daily, monthly).\n",
    "\n",
    "+ **RangeIndex with step**: A default integer index, regularly spaced.\n",
    "\n",
    "Other index types (such as `DatetimeIndex` without frequency, or custom indices) are not supported, and their use will raise an error.\n",
    "\n",
    "## Number of time series\n",
    "\n",
    "The **skforecast** library offers a **variety of forecaster** types, each tailored to specific requirements such as single or multiple time series, direct or recursive strategies, or custom predictors. Regardless of the specific forecaster type, all instances share the same API.\n",
    "\n",
    "| Forecaster                      | Single series | Multiple series | Recursive strategy | Direct strategy | Probabilistic prediction | Time series differentiation | Exogenous features | Window features |\n",
    "|:--------------------------------|:-------------:|:---------------:|:------------------:|:---------------:|:------------------------:|:---------------------------:|:------------------:|:---------------:|\n",
    "|[ForecasterRecursive]            |✔️||✔️||✔️|✔️|✔️|✔️|\n",
    "|[ForecasterDirect]               |✔️|||✔️|✔️|✔️|✔️|✔️|\n",
    "|[ForecasterRecursiveMultiSeries] ||✔️|✔️||✔️|✔️|✔️|✔️|\n",
    "|[ForecasterDirectMultiVariate]   ||✔️||✔️|✔️|✔️|✔️|✔️|\n",
    "|[ForecasterRNN]                  |✔️|✔️||✔️|✔️||✔️||\n",
    "|[ForecasterStats]                |✔️||✔️||✔️|✔️|✔️||\n",
    "\n",
    "[ForecasterRecursive]: ../user_guides/autoregressive-forecaster.html\n",
    "[ForecasterDirect]: ../user_guides/direct-multi-step-forecasting.html\n",
    "[ForecasterRecursiveMultiSeries]: ../user_guides/independent-multi-time-series-forecasting.html\n",
    "[ForecasterDirectMultiVariate]: ../user_guides/dependent-multi-series-multivariate-forecasting.html\n",
    "[ForecasterRNN]: ../user_guides/forecasting-with-deep-learning-rnn-lstm.html\n",
    "[ForecasterStats]: ../user_guides/forecasting-sarimax-arima.html"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Libraries and data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Libraries\n",
    "# ==============================================================================\n",
    "import pandas as pd\n",
    "from lightgbm import LGBMRegressor\n",
    "from skforecast.datasets import fetch_dataset\n",
    "from skforecast.recursive import ForecasterRecursive"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">╭────────────────────────────────────── <span style=\"font-weight: bold\">h2o</span> ───────────────────────────────────────╮\n",
       "│ <span style=\"font-weight: bold\">Description:</span>                                                                     │\n",
       "│ Monthly expenditure ($AUD) on corticosteroid drugs that the Australian health    │\n",
       "│ system had between 1991 and 2008.                                                │\n",
       "│                                                                                  │\n",
       "│ <span style=\"font-weight: bold\">Source:</span>                                                                          │\n",
       "│ Hyndman R (2023). fpp3: Data for Forecasting: Principles and Practice(3rd        │\n",
       "│ Edition). http://pkg.robjhyndman.com/fpp3package/,https://github.com/robjhyndman │\n",
       "│ /fpp3package, http://OTexts.com/fpp3.                                            │\n",
       "│                                                                                  │\n",
       "│ <span style=\"font-weight: bold\">URL:</span>                                                                             │\n",
       "│ https://raw.githubusercontent.com/skforecast/skforecast-                         │\n",
       "│ datasets/main/data/h2o.csv                                                       │\n",
       "│                                                                                  │\n",
       "│ <span style=\"font-weight: bold\">Shape:</span> 204 rows x 2 columns                                                      │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────╯\n",
       "</pre>\n"
      ],
      "text/plain": [
       "╭────────────────────────────────────── \u001b[1mh2o\u001b[0m ───────────────────────────────────────╮\n",
       "│ \u001b[1mDescription:\u001b[0m                                                                     │\n",
       "│ Monthly expenditure ($AUD) on corticosteroid drugs that the Australian health    │\n",
       "│ system had between 1991 and 2008.                                                │\n",
       "│                                                                                  │\n",
       "│ \u001b[1mSource:\u001b[0m                                                                          │\n",
       "│ Hyndman R (2023). fpp3: Data for Forecasting: Principles and Practice(3rd        │\n",
       "│ Edition). http://pkg.robjhyndman.com/fpp3package/,https://github.com/robjhyndman │\n",
       "│ /fpp3package, http://OTexts.com/fpp3.                                            │\n",
       "│                                                                                  │\n",
       "│ \u001b[1mURL:\u001b[0m                                                                             │\n",
       "│ https://raw.githubusercontent.com/skforecast/skforecast-                         │\n",
       "│ datasets/main/data/h2o.csv                                                       │\n",
       "│                                                                                  │\n",
       "│ \u001b[1mShape:\u001b[0m 204 rows x 2 columns                                                      │\n",
       "╰──────────────────────────────────────────────────────────────────────────────────╯\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>y</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1991-07-01</th>\n",
       "      <td>0.429795</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1991-08-01</th>\n",
       "      <td>0.400906</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1991-09-01</th>\n",
       "      <td>0.432159</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1991-10-01</th>\n",
       "      <td>0.492543</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1991-11-01</th>\n",
       "      <td>0.502369</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2008-02-01</th>\n",
       "      <td>0.761822</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2008-03-01</th>\n",
       "      <td>0.649435</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2008-04-01</th>\n",
       "      <td>0.827887</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2008-05-01</th>\n",
       "      <td>0.816255</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2008-06-01</th>\n",
       "      <td>0.762137</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>204 rows × 1 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                   y\n",
       "date                \n",
       "1991-07-01  0.429795\n",
       "1991-08-01  0.400906\n",
       "1991-09-01  0.432159\n",
       "1991-10-01  0.492543\n",
       "1991-11-01  0.502369\n",
       "...              ...\n",
       "2008-02-01  0.761822\n",
       "2008-03-01  0.649435\n",
       "2008-04-01  0.827887\n",
       "2008-05-01  0.816255\n",
       "2008-06-01  0.762137\n",
       "\n",
       "[204 rows x 1 columns]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Download data\n",
    "# ==============================================================================\n",
    "data = fetch_dataset(\n",
    "    name=\"h2o\", raw=True, kwargs_read_csv={\"names\": [\"y\", \"date\"], \"header\": 0}\n",
    ")\n",
    "data[\"date\"] = pd.to_datetime(data[\"date\"], format=\"%Y-%m-%d\")\n",
    "data = data.set_index(\"date\")\n",
    "data = data.asfreq(\"MS\")\n",
    "data"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train and predict using input with DatetimeIndex and frequency"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Index type      : <class 'pandas.core.indexes.datetimes.DatetimeIndex'>\n",
      "Index frequency : <MonthBegin>\n"
     ]
    }
   ],
   "source": [
    "# Index type and frequency\n",
    "# ==============================================================================\n",
    "print(f\"Index type      : {type(data.index)}\")\n",
    "print(f\"Index frequency : {data.index.freq}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2008-07-01    0.861239\n",
       "2008-08-01    0.871102\n",
       "2008-09-01    0.835840\n",
       "2008-10-01    0.938713\n",
       "2008-11-01    1.004192\n",
       "Freq: MS, Name: pred, dtype: float64"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create and fit forecaster\n",
    "# ==============================================================================\n",
    "forecaster = ForecasterRecursive(\n",
    "                 estimator = LGBMRegressor(random_state=123, verbose=-1),\n",
    "                 lags      = 5\n",
    "             )\n",
    "\n",
    "forecaster.fit(y=data['y'])\n",
    "\n",
    "# Predictions\n",
    "# ==============================================================================\n",
    "forecaster.predict(steps=5)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train and predict using input with RangeIndex"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>y</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.429795</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.400906</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.432159</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.492543</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.502369</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>199</th>\n",
       "      <td>0.761822</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>200</th>\n",
       "      <td>0.649435</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>201</th>\n",
       "      <td>0.827887</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>202</th>\n",
       "      <td>0.816255</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>203</th>\n",
       "      <td>0.762137</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>204 rows × 1 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            y\n",
       "0    0.429795\n",
       "1    0.400906\n",
       "2    0.432159\n",
       "3    0.492543\n",
       "4    0.502369\n",
       "..        ...\n",
       "199  0.761822\n",
       "200  0.649435\n",
       "201  0.827887\n",
       "202  0.816255\n",
       "203  0.762137\n",
       "\n",
       "[204 rows x 1 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Data without datetime index\n",
    "# ==============================================================================\n",
    "data = data.reset_index(drop=True)\n",
    "data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Index type : <class 'pandas.core.indexes.range.RangeIndex'>\n",
      "Index step : 1\n"
     ]
    }
   ],
   "source": [
    "# Index type and step\n",
    "# ==============================================================================\n",
    "print(f\"Index type : {type(data.index)}\")\n",
    "print(f\"Index step : {data.index.step}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "204    0.861239\n",
       "205    0.871102\n",
       "206    0.835840\n",
       "207    0.938713\n",
       "208    1.004192\n",
       "Name: pred, dtype: float64"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Fit - Predict\n",
    "# ==============================================================================\n",
    "forecaster.fit(y=data['y'])\n",
    "forecaster.predict(steps=5)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "skforecast_py12",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.11"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": true,
   "title_cell": "Tabla de contenidos",
   "title_sidebar": "Tabla de contenidos",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "165px"
   },
   "toc_section_display": true,
   "toc_window_display": false
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "position": {
    "height": "144.391px",
    "left": "1478px",
    "right": "20px",
    "top": "126px",
    "width": "350px"
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
